Wavelet Based Fractal Analysis of Airborne Pollen 



M.E. Degaudenzi, C.M. Arizmendi 

Depto. de Fisica, Facultad de Ingenieria, Universidad Nacional de Mar del Plata, 
Av. J.B. Justo 4302, 7600 Mar del Plata, Argentina 
(February 5, 2008) 



Abstract 



The most abundant biological particles in the atmosphere are pollen grains 
and spores. People with pollen allergy can protect themselves if they are 
informed of future pollen content in the air. Despite the importance of 
forecasting airborne pollen concentration, it has not been possible to predict 
pollen concentrations with great accuracy, and about 25% of daily pollen 
forecasts have failed. Previous analyses of the dynamic characteristics 
of atmospheric pollen time series indicate that the system can be described 
by a low-dimensional chaotic map. We apply the wavelet transform to study 
the multifractal characteristics of an airborne pollen time series. We 
characterize the persistence behaviour associated with low pollen concentration 
values and with the rarest events, those of highest pollen concentration. The 
information and the correlation dimensions correspond to a chaotic system 
showing loss of information with time evolution. 

I. INTRODUCTION 



Pollen allergy is a common disease causing hay fever in 5-10% of the population. 
Although it is not a life-threatening disease, the symptoms can be very troublesome; 
furthermore, the costs to the social sector due to pollen-related diseases are high. 
Hay fever patients can protect themselves if they are informed of future pollen 
content in the air [1]. 

Models to forecast pollen concentration in the air are principally based on the 
interaction between pollen and atmospheric weather. Several statistical techniques [?] 
have been used to predict future atmospheric pollen concentrations from the weather 
conditions of the day and of the preceding days. In spite of these attempts, it has not 
been possible to predict pollen concentrations with great accuracy, and about 25% of 
daily pollen forecasts have failed [?]. 

One reason for these failures could be that the methods used in airborne pollen 
forecasting are based on standard linear statistical techniques, which are unsuitable 
when the phenomenon to be forecast is essentially nonlinear. 

A previous analysis of the dynamic characteristics of atmospheric pollen time series 
was carried out by Bianchi et al. [?] through the study of the correlation dimension [?,7]. 
The dimension found was low and non-integer [?], which indicates that the system may be 
described by a nonlinear function of just a few variables relating neighbouring pollen 
concentrations in the time series. Because the correlation dimension found is fractal, this 
function, called a map in nonlinear dynamics, can display chaotic behavior under certain 
circumstances. The existence of a low-dimensional map suggests that short-term 
prediction [8] is possible through some nonlinear model. Artificial neural networks have 
been widely used to predict future values of chaotic time series, identifying the nonlinear 
model by extracting knowledge from the past [?]. Very good pollen concentration forecasts 
were obtained using neural nets [10] and, in a previous work, the hypothesis that the random 
fluctuations appearing in the pollen time series are produced by Gaussian noise was rejected. 

To continue the characterization of airborne pollen concentrations, the next step 
is to study what kind of correlation is associated with their fluctuations. The Hurst 
exponent $H$ is broadly used as a measure of the persistence of a statistical phenomenon. 
$0 < H < 0.5$ indicates an antipersistent time series, commonly driven by a phenomenon 
called the "Noah Effect" (in the Bible, the storm changed everything in a moment). It 
characterizes a system that reverses itself more frequently and covers less distance than a 
random walk. $0.5 < H < 1$ implies a persistent time series, which obeys the "Joseph 
Effect" (in the Bible, seven years of boom, happiness and health are followed by seven 
years of hunger and illness). Such a system has long memory effects: what happens 
now will influence the future, so there is a very deep dependence on the initial conditions. 
Persistent processes are common in nature [12], [13]. The local counterpart of $H$ is the 
Hölder, or singularity, exponent $\alpha$. If the distribution is homogeneous there 
is a unique $\alpha = H$, but if it is not there are several exponents $\alpha$. The most 
frequent $\alpha$ then characterizes the series and plays the role of the Hurst exponent $H$. 
A very efficient method to obtain the $f(\alpha)$ singularity spectrum of a pollen time 
series relies on a mathematical tool introduced in signal analysis in the early eighties: 
the wavelet transform. The wavelet transform has proved very efficient at detecting 
singularities, and fractals are indeed singular functions. Arneodo et al. [14,15] developed 
the wavelet transform modulus maxima (WTMM) method as a technique to study fractal 
objects. In this method the wavelet is used as an oscillating variant of the "square" 
function of a box. WTMM has been successfully applied to study the fractal properties of 
diverse systems such as DNA nucleotide sequences [?], the Modane turbulent velocity 
signal [17,18] and a cool flame experiment [19]. 



We apply WTMM to obtain the Hurst exponent $H$ associated with the pollen time series 
as a whole, as well as the persistence of the important rare peaks of highest concentration. 
Another important tool for describing multifractals, also obtained through WTMM, is 
the spectrum of generalized fractal dimensions $D_q$. 

II. EXPERIMENTAL SETUP 

The material used in this work comes from our chaos study of pollen series [?]. 

Airborne pollen concentration data were obtained with an automatic volumetric 
Burkard pollen and spore trap, situated on the roof of the Facultad de Ciencias Exactas y 
Naturales of our University, 12 meters above ground level. The area surrounding the sampler 
is typical of Mar del Plata. Because of the great distance from the sampling site to the 
emission sources, the particular emission spectra are not important. 

Ten liters of air per minute were drawn through a 14 × 2 mm² orifice, always oriented 
against the wind flow. The suction rate was checked weekly. Behind the slit, a drum rotates 
at a speed of 2 mm per hour. The particles are collected on a cellophane tape (Melinex), 19 
mm wide, just below the orifice. The sticky collecting surface comprises nine parts vaseline 
to one part paraffin in toluene. The exposed tape was removed from the drum, cut into pieces 
of 48 mm, corresponding to 24-h intervals, then embedded in a solution of polyvinyl alcohol 
(Gelvatol), water and glycerol and covered with a cover glass. Slides were studied as 12 
transects per day. The pollen was counted at a magnification of ×400 for the first year cycle 
(August 1987-88) and at ×200 for the second (August 1988-89), corresponding to 13.5 and 
27 min of sampling every 2 h, respectively. The counting method follows that of 
Käpylä and Penttinen [20]. Hourly counts were stored in a database file for further analysis. 
Statistics of the hourly counts may be seen in Tables 1 and 2 of [?]. The concentration values 
correspond to total pollen grains. The main species found were: Cupressus, Gramineae, 
Eucalyptus, Pinaceae, Chenopodiineae, Plantago, Cyperaceae, Betula, Cruciferae, Compositae 
Tubuliflorae, Ambrosia, Ulmus, Umbelliferae, Platanus and Fraxinus. 
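
The sampling figures quoted above are mutually consistent, which can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the constant and function names are hypothetical, and reading "13.5 and 27 min of sampling every 2 h" as the fraction of each time step that was actually counted is an assumption.

```python
# Values quoted in the text; all names here are illustrative only.
DRUM_SPEED_MM_PER_H = 2.0   # drum rotation speed
TRANSECTS_PER_DAY = 12      # slides were read as 12 transects per day


def tape_length_mm(hours):
    """Length of tape exposed during the given number of hours."""
    return DRUM_SPEED_MM_PER_H * hours


def hours_per_transect():
    """Time step between consecutive transects."""
    return 24.0 / TRANSECTS_PER_DAY


def sampled_fraction(minutes_counted):
    """Fraction of each time step that was counted (assumed interpretation)."""
    return minutes_counted / (60.0 * hours_per_transect())


if __name__ == "__main__":
    print(tape_length_mm(24))      # tape exposed per day, in mm
    print(hours_per_transect())    # time step of the series, in hours
    print(sampled_fraction(13.5))  # first year cycle
    print(sampled_fraction(27.0))  # second year cycle
```

Under these assumptions one day corresponds to 48 mm of tape and the series has a 2 h time step, in agreement with the text and with the caption of Fig. 1.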

III. THE MULTIFRACTAL FORMALISM 

The aim of this formalism is to determine the $f(\alpha)$ singularity spectrum of a measure 
$\mu$. It associates the Hausdorff dimension of the set of points sharing a given singularity 
exponent $\alpha$ with that exponent, which gives an idea of the strength of the singularity: 

$$N_\alpha(\epsilon) \sim \epsilon^{-f(\alpha)}, \qquad (1)$$

where $N_\alpha(\epsilon)$ is the number of boxes of size $\epsilon$ needed to cover the points 
of the measure with singularity exponent $\alpha$ [?]. 

A partition function $Z$ can be defined from this spectrum (the model is the same as the 
thermodynamic one): 

$$Z(q,\epsilon) = \sum_{i=1}^{N(\epsilon)} \mu_i^q(\epsilon) \sim \epsilon^{\tau(q)} \quad \text{for } \epsilon \to 0, \qquad (2)$$

where $\tau(q)$ is a spectrum that arises from Legendre transforming the $f(\alpha)$ singularity 
spectrum. 

The spectrum of generalized fractal dimensions $D_q$ is obtained from the spectrum $\tau(q)$: 

$$D_q = \frac{\tau(q)}{q-1}. \qquad (3)$$

The capacity or box dimension of the support of the distribution is given by $D_0 = 
f(\alpha(0)) = -\tau(0)$. $D_1 = f(\alpha(1)) = \alpha(1)$ corresponds to the scaling behavior of 
the information and is called the information dimension. For $q \ge 2$, $D_q$ and the $q$-point 
correlation integrals are related. 
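
The relations between $\tau(q)$, $D_q$ and $f(\alpha)$ can be illustrated on a measure whose $\tau(q)$ is known in closed form. The sketch below assumes a binomial cascade with weights $p$ and $1-p$ (a textbook example, not the pollen data), for which $Z(q,\epsilon) = \epsilon^{\tau(q)}$ holds exactly with $\tau(q) = -\log_2(p^q + (1-p)^q)$. The function names and the value $p = 0.3$ are illustrative choices.

```python
import math


def tau_binomial(q, p=0.3):
    """tau(q) of a binomial cascade with weights p and 1 - p (assumed example)."""
    return -math.log2(p ** q + (1.0 - p) ** q)


def generalized_dimension(q, tau, dq=1e-6):
    """D_q = tau(q) / (q - 1); at q = 1 the limit d(tau)/dq is used."""
    if abs(q - 1.0) < 1e-9:
        return (tau(1.0 + dq) - tau(1.0 - dq)) / (2.0 * dq)
    return tau(q) / (q - 1.0)


def legendre_point(q, tau, dq=1e-6):
    """Return (alpha, f) with alpha = d(tau)/dq and f = q * alpha - tau(q)."""
    alpha = (tau(q + dq) - tau(q - dq)) / (2.0 * dq)
    return alpha, q * alpha - tau(q)


if __name__ == "__main__":
    for q in (-5.0, 0.0, 1.0, 2.0, 5.0):
        alpha, f = legendre_point(q, tau_binomial)
        d_q = generalized_dimension(q, tau_binomial)
        print(q, round(d_q, 4), round(alpha, 4), round(f, 4))
```

For this example $D_0 = -\tau(0) = 1$, and at $q = 1$ the values of $D_1$, $\alpha(1)$ and $f(\alpha(1))$ coincide because $\tau(1) = 0$, which is the identity $D_1 = f(\alpha(1)) = \alpha(1)$ stated above.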

As we will show in the following section, the Wavelet Transform is especially well suited to 
analyzing a time series as a multifractal. 



IV. WAVELET TRANSFORM 



The Wavelet Transform (WT) [21,22] of a signal $s(t)$ consists in decomposing it into 
frequency and time coefficients associated with the wavelets. The analyzing wavelet $\psi$, by 
means of translations and dilations, generates the so-called family of wavelets. 

The Wavelet Transform turns the signal $s(t)$ into a function $T_\psi[s](a,b)$: 

$$T_\psi[s](a,b) = \frac{1}{a} \int \psi^*\!\left(\frac{t-b}{a}\right) s(t)\,dt, \qquad (4)$$

where $\psi^*$ is the complex conjugate of $\psi$, $a$ is the frequency dilation factor and $b$ 
is the time translation parameter. 

The wavelet must be chosen to satisfy the condition 

$$\int \psi(t)\,dt = 0, \qquad (5)$$

and to be orthogonal to low-order polynomials: 

$$\int t^m \psi(t)\,dt = 0, \quad 0 \le m < n, \qquad (6)$$

where $m$ is the order of the polynomial. In other words, low-order polynomial behavior is 
eliminated, and we can detect and characterize singularities even if they are masked by a 
smooth behavior. 
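
Conditions (5) and (6) can be checked numerically for a concrete wavelet. The sketch below assumes the third derivative of a Gaussian, $(3t - t^3)e^{-t^2/2}$, which is the analyzing wavelet used later in Section V. The integration range, grid size and function names are illustrative choices.

```python
import math


def psi3(t):
    """Third derivative of exp(-t^2/2), i.e. (3t - t^3) exp(-t^2/2)."""
    return (3.0 * t - t ** 3) * math.exp(-0.5 * t * t)


def moment(m, wavelet=psi3, half_width=10.0, n=4001):
    """Trapezoidal estimate of the m-th moment, the integral of t^m * wavelet(t)."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        t = -half_width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * (t ** m) * wavelet(t)
    return total * h


if __name__ == "__main__":
    for m in range(5):
        print(m, moment(m))
```

The moments of order 0, 1 and 2 vanish to numerical precision, while the moment of order 3 does not (analytically it equals $-6\sqrt{2\pi}$), so this wavelet is blind to polynomial trends up to second order.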

The WT provides a useful tool for the detection of self-similarity or self-affinity in 
temporal series. For a value $b$ in the domain of the signal, the modulus of the transform is 
maximized when the frequency $a$ is of the same order as the characteristic frequency of the 
signal $s(t)$ in the neighborhood of $b$; the signal then has a local singularity exponent 
$\alpha(b) \in\, ]n, n+1[$. 

This means that around $b$ 

$$|s(t) - P_n(t)| \sim |t-b|^{\alpha(b)}, \qquad (7)$$

where $P_n(t)$ is a polynomial of order $n$, and 

$$T_\psi[s](a,b) \sim a^{\alpha(b)}, \qquad (8)$$

provided the first $n+1$ moments are zero. 

If we take $\psi^{(N)}(x) = d^N(e^{-x^2/2})/dx^N$, the first $N$ moments vanish. 

The wavelet modulus function $|T_\psi[s](a,t)|$ has a local maximum around the points 
where the signal is singular. These local maxima form a geometric locus called a 
modulus maxima line $l$: 

$$|T_\psi[s](a, b_l(a))| \sim a^{\alpha(b_l(a))} \quad \text{for } a \to 0, \qquad (9)$$

where $b_l(a)$ is the position at scale $a$ of the maximum belonging to the line $l$. 

The Wavelet Transform Modulus Maxima Method (WTMM) consists in the analysis of 
the scaling behavior of partition functions $Z(q,a)$ that can be defined as 

$$Z(q,a) = \sum_{l} |T_\psi[s](a, b_l(a))|^q, \qquad (10)$$

where the sum runs over the maxima lines $l$ present at scale $a$, and which scale like 
$a^{\tau(q)}$ [14,15]. 

This partition function works like the partition function previously defined for singular 
measures. For $q > 0$ the most pronounced modulus maxima prevail, whereas for $q < 0$ 
the weaker ones dominate. The most pronounced maxima occur where very deep singularities 
are detected, while the others correspond to smoother singularities. 
We can obtain $\tau(q)$ (Eq. 2) and from it the $f(\alpha)$ and $D_q$ spectra, as explained 
previously. The shape of $f(\alpha)$ is a hump with a maximum value; the $\alpha$ corresponding 
to this maximum may be associated with the general behavior of the series. This particular 
singularity exponent can thus be thought of as the Hurst exponent $H$ for the series as a whole. 
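
The steps of the method (locate the modulus maxima at each scale, form $Z(q,a)$, and read $\tau(q)$ from the slope of $\log Z$ against $\log a$) can be sketched as follows. This is a simplified illustration and not the implementation used for the pollen data: it takes the maxima independently at each scale and does not chain them into lines across scales, and the toy input (a single maximum of height $a^{0.5}$ per scale) is an assumption chosen so that the expected answer, $\tau(q) = 0.5\,q$, is known.

```python
import math


def local_maxima(values):
    """Indices whose value is strictly larger than both neighbours."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


def partition_function(modulus_row, q):
    """Z(q, a): sum of |T|^q over the modulus maxima found at one scale."""
    peaks = [modulus_row[i] for i in local_maxima(modulus_row)]
    return sum(m ** q for m in peaks)


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return num / sum((x - mean_x) ** 2 for x in xs)


def tau_of_q(scales, modulus_rows, q):
    """tau(q) as the slope of log Z(q, a) versus log a."""
    log_a = [math.log(a) for a in scales]
    log_z = [math.log(partition_function(row, q)) for row in modulus_rows]
    return fit_slope(log_a, log_z)


if __name__ == "__main__":
    # Toy input: one maximum per scale whose height follows a**0.5,
    # mimicking a single isolated singularity with exponent 0.5.
    scales = [2.0 ** -n for n in range(1, 6)]
    rows = [[0.0, a ** 0.5, 0.0] for a in scales]
    for q in (-2.0, 0.0, 1.0, 2.0):
        print(q, round(tau_of_q(scales, rows, q), 6))
```

For this toy input $\tau(q)$ is linear in $q$, as expected for a homogeneous signal with a unique exponent; a multifractal signal would instead give a nonlinear $\tau(q)$.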



V. APPLICATION OF WTMM TO THE POLLEN TIME SERIES 

The airborne pollen concentration time series may be seen in Fig. 1. The third derivative 
of the Gaussian function was chosen as analyzing wavelet: 

$$\psi^{(3)}(t) = \frac{d^3}{dt^3}\left(e^{-t^2/2}\right). \qquad (11)$$

Twelve wavelet transform data files were obtained by applying the Wavelet Transform with 
$\psi^{(3)}$, varying the scaling factor $a$ from $a_{min} = 1/256$ to $a_{max} = 8$ in steps 
of $2^n$. To give an idea of the effect of the change of scale on the wavelet transform of the 
pollen time series, three of them are shown in Fig. 2. 
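
Reading "in steps of $2^n$" as $a = 2^n$ with integer $n$ (an interpretation, since the text does not list the scales explicitly) reproduces the twelve transforms mentioned above. The helper name below is hypothetical.

```python
def dyadic_scales(n_min=-8, n_max=3):
    """Scales a = 2**n for integer n from n_min to n_max inclusive."""
    return [2.0 ** n for n in range(n_min, n_max + 1)]


scales = dyadic_scales()
print(len(scales), scales[0], scales[-1])  # 12 0.00390625 8.0
```

The three scales displayed in Fig. 2 ($a = 1$, $1/8$ and $1/64$) all belong to this list.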

Then we computed the partition function $Z(q,a)$ for $-30 < q < 30$ and $1/256 < a < 8$, 
obtaining $\tau(q)$, as shown in Fig. 3. 

$\tau(q)$ is a nonlinear convex increasing function with $\tau(0) = -0.97$ and two asymptotic 
slopes, $\alpha_{min} = 0.40$ for $q > 0$ and $\alpha_{max} = 1.39$ for $q < 0$. 

This yields the corresponding $f(\alpha)$ singularity spectrum, obtained by Legendre 
transforming $\tau(q)$, which is displayed in Fig. 4. The single-humped shape with a nonunique 
Hölder exponent characterizes a multifractal. 

As expected from $\tau(q)$, the support of $f(\alpha)$ extends over a finite interval whose bounds 
are $\alpha_{min} = 0.40$ and $\alpha_{max} = 1.39$. The minimum value, $\alpha_{min}$, corresponds 
to the strongest singularity, which characterizes the most rarefied zone, whereas higher values 
correspond to weaker singularities up to $\alpha_{max}$, the weakest singularity, which corresponds 
to the densest zone. $\alpha_{min} < 1/2$ corresponds to an antipersistent process and 
$\alpha_{max} > 1$ to a regular process. 

The $D_q$ spectrum obtained from $\tau(q)$ can be seen in Fig. 5. The support dimension is 
$D_0 = D_{max} = -\tau(0) = 0.97$, which implies that the capacity of the support is approximately 
1; i.e., the support is not a fractal. $D_q$ converges asymptotically to $\alpha_{min} = 0.40$ 
for $q_{max}$ and to $\alpha_{max} = 1.38$ for $q_{min}$. 

The Hölder exponent for the dimension support, $\alpha(D_{max})$, is 0.90. This particular $\alpha$ 
corresponds to $f(\alpha)_{max}$ or $D_{max}$, which implies that the events with 
$\alpha = \alpha(D_{max}) = 0.90$ are the most frequent ones. $0.5 < \alpha < 1$ implies that we 
are analyzing a persistent time series, which obeys the "Joseph Effect" (in the Bible, seven 
years of boom, happiness and health are followed by seven years of hunger and illness) [13]. 
This system has long memory effects: what happens now will influence the future, so there is a 
very deep dependence on the initial conditions. It may be thought of as a Fractional Brownian 
Motion with $\alpha > 0.5$. A Hurst exponent of 0.90 describes a very persistent time series, 
as expected for a natural process involved in an inertial system. $\alpha$ is also known as the 
Hölder exponent or singularity exponent. If the distribution is homogeneous there is a unique 
$\alpha = H$ (for example, Fractional Brownian Motion), but if it is not there are several 
exponents $\alpha$. The most frequent $\alpha$ characterizes the series and plays the role of 
the Hurst exponent. 
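
The ranges used in this discussion can be collected in a small helper, shown here only as an illustration. The function name and the labels are hypothetical, and the treatment of the boundary values 0.5 and 1 is an assumption, since the text only discusses the open intervals.

```python
def classify_exponent(alpha):
    """Label a Hurst or Hoelder exponent using the ranges quoted in the text."""
    if alpha < 0.5:
        return "antipersistent"
    if alpha == 0.5:
        return "uncorrelated (random walk)"  # boundary case, assumed label
    if alpha < 1.0:
        return "persistent"
    if alpha == 1.0:
        return "boundary between persistent and regular"  # assumed label
    return "regular"


if __name__ == "__main__":
    # Exponents reported for the pollen series.
    for value in (0.40, 0.90, 1.39):
        print(value, classify_exponent(value))
```

Applied to the reported values, this labels $\alpha_{min} = 0.40$ as antipersistent, $\alpha(D_{max}) = 0.90$ as persistent and $\alpha_{max} = 1.39$ as regular.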

The midpoint of the support, $(\alpha_{min} + \alpha_{max})/2$, equals $\alpha(D_{max}) = 0.90$. 
This means that the curve is equally humped on both sides, so that the inhomogeneity is the 
same in the less frequent events associated with the $q < 0$ branch and in the more frequent 
ones associated with the $q > 0$ branch. 

The information dimension is $D_1 = f(\alpha(1)) = f(0.68) = 0.68$, which characterizes the 
scaling behavior of the information. It plays an important role in the analysis of nonlinear 
dynamic systems, especially in describing the loss of information as a chaotic system evolves 
in time [23]. $D_1 = 0.68$ implies that we are in the presence of a chaotic system. 

The correlation dimension is $D_2 = 0.60$, which characterizes a chaotic attractor and is 
very close to the value previously obtained with the Grassberger-Procaccia method [?]. 



VI. CONCLUSION 



The Wavelet Transform Modulus Maxima Method was applied to study the multifractal 
characteristics of an airborne pollen time series. We have found that the pollen time series 
behaves as a whole like a persistent phenomenon with long-term memory, as most phenomena 
in nature do. The most common events, associated with $\alpha_{max}$, which correspond to low 
pollen concentration values, behave in a persistent way, as does the whole series. On the other 
hand, the rarest events, associated in the multifractal formalism with $\alpha_{min}$, which 
correspond to the highest pollen concentration values, behave in an antipersistent way 
characterized by the "Noah Effect", changing the air conditions suddenly and catastrophically. 
Both the information and the correlation dimensions correspond to a chaotic system showing 
loss of information with time evolution. 

FIGURE CAPTIONS 



Fig. 1. Two years of airborne pollen concentration time series. The time step is 2 hours. 
Fig. 2. Wavelet transform data of the pollen time series distribution: a) scale $a = 1$, b) scale 
$a = 1/8$, c) scale $a = 1/64$. 

Fig. 3. $\tau(q)$ spectrum of the pollen time series. 

Fig. 4. $f(\alpha)$ spectrum of the pollen time series. The bounds of the support of $f(\alpha)$ 
are $\alpha_{min} = 0.40$, which corresponds to an antipersistent process, and $\alpha_{max} = 1.39$, 
which corresponds to a regular process. The Hölder exponent for the dimension support, 
$\alpha(D_{max})$, is 0.90, which characterizes a persistent process. 

Fig. 5. $D_q$ spectrum of the pollen time series. The support dimension is $D_0 = D(q = 0) \simeq 1$. 
$D_q$ converges asymptotically to $\alpha_{min} = 0.40$ for $q_{max}$ and to $\alpha_{max} = 1.38$ 
for $q_{min}$. 